An Empirical Study of Non-Lexical Extensions to Delexicalized Transfer

Authors

  • Anders Søgaard
  • Julie Wulff
Abstract

We propose a simple cross-language parser adaptation strategy for discriminative parsers and apply it to easy-first transition-based dependency parsing (Goldberg and Elhadad, 2010). We evaluate our parsers on the Indo-European corpora in the CoNLL-X and CoNLL 2007 shared tasks. For each target language, we use the remaining languages as source data, average the under-fitted weights learned from each source language, and apply the resulting linear classifier to the target language. Some source languages, and some sentences in those languages, are more relevant than others for a given target language. We therefore explore improvements to our cross-language adaptation model involving source language weighting and instance weighting, as well as unsupervised model selection. Overall, our cross-language adaptation strategies provide better results than previous strategies for direct transfer, with near-linear time parsing and much faster training times than other approaches.
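The weight-averaging idea in the abstract can be illustrated with a minimal sketch. It is not the authors' parser: it assumes a toy binary perceptron in place of the multi-class easy-first parsing model, and the names `train_perceptron`, `average_weights`, `predict`, the language keys `lang_a` and `lang_b`, and the feature vectors are all hypothetical. Training for a single epoch stands in for "under-fitted" weights, and the optional `source_weights` argument stands in for source language weighting.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# (feature vector, label in {-1, +1}); delexicalized features are assumed.
Example = Tuple[Sequence[float], int]


def train_perceptron(data: List[Example], dim: int, epochs: int = 1) -> List[float]:
    """Train a binary perceptron for few epochs, so the weights stay under-fitted."""
    w = [0.0] * dim
    for _ in range(epochs):
        for x, y in data:
            score = sum(wi * xi for wi, xi in zip(w, x))
            if y * score <= 0:  # update only on mistakes (or zero margin)
                w = [wi + y * xi for wi, xi in zip(w, x)]
    return w


def average_weights(
    models: Dict[str, List[float]],
    source_weights: Optional[Dict[str, float]] = None,
) -> List[float]:
    """Average per-source weight vectors, optionally weighting each source language."""
    names = sorted(models)
    if source_weights is None:
        source_weights = {name: 1.0 for name in names}  # uniform average
    total = sum(source_weights[name] for name in names)
    dim = len(models[names[0]])
    return [
        sum(source_weights[name] * models[name][i] for name in names) / total
        for i in range(dim)
    ]


def predict(w: Sequence[float], x: Sequence[float]) -> int:
    """Apply the averaged linear classifier to a target-language instance."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else -1


# Hypothetical toy data for two source languages.
sources: Dict[str, List[Example]] = {
    "lang_a": [([1.0, 0.0], 1), ([0.0, 1.0], -1)],
    "lang_b": [([1.0, 1.0], 1), ([-1.0, 0.0], -1)],
}

models = {name: train_perceptron(data, dim=2, epochs=1) for name, data in sources.items()}
uniform = average_weights(models)
weighted = average_weights(models, {"lang_a": 3.0, "lang_b": 1.0})
print(uniform, weighted, predict(uniform, [2.0, 0.5]))
```

In this sketch, giving a source language a larger entry in `source_weights` pulls the averaged classifier toward that language's weights, which is one simple way to express that some source languages are more relevant to the target than others.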


Similar resources

A Distributed Representation-Based Framework for Cross-Lingual Transfer Parsing

This paper investigates the problem of cross-lingual transfer parsing, aiming at inducing dependency parsers for low-resource languages while using only training data from a resource-rich language (e.g., English). Existing model transfer approaches typically don’t include lexical features, which are not transferable across languages. In this paper, we bridge the lexical feature gap by using dis...


A Representation Learning Framework for Multi-Source Transfer Parsing

Cross-lingual model transfer has been a promising approach for inducing dependency parsers for low-resource languages where annotated treebanks are not available. The major obstacles for the model transfer approach are two-fold: 1. Lexical features are not directly transferable across languages; 2. Target language-specific syntactic structures are difficult to recover. To address these two c...


Parsing Natural Language Sentences by Semi-supervised Methods

We present our work on semi-supervised parsing of natural language sentences, focusing on multi-source cross-lingual transfer of delexicalized dependency parsers. We first evaluate the influence of treebank annotation styles on parsing performance, focusing on adposition attachment style. Then, we present KLcpos3, an empirical language similarity measure, designed and tuned for source parser we...


Language Transfer Learning for Supervised Lexical Substitution

We propose a framework for lexical substitution that is able to perform transfer learning across languages. Datasets for this task are available in at least three languages (English, Italian, and German). Previous work has addressed each of these tasks in isolation. In contrast, we regard the union of three shared tasks as a combined multilingual dataset. We show that a supervised system can be...


Supervised All-Words Lexical Substitution using Delexicalized Features

We propose a supervised lexical substitution system that does not use separate classifiers per word and is therefore applicable to any word in the vocabulary. Instead of learning word-specific substitution patterns, a global model for lexical substitution is trained on delexicalized (i.e., non-lexical) features, which makes it possible to exploit the power of supervised methods while being able to general...



Publication date: 2012